Model-Based Pose Estimation

Authors

  • Gerard Pons-Moll
  • Bodo Rosenhahn
Abstract

Model-based pose estimation algorithms aim to recover human motion from one or more camera views and a 3D model representation of the human body. The model pose is usually parametrized with a kinematic chain, so the pose is represented by a vector of joint angles. The majority of algorithms are based on minimizing an error function that measures how well the 3D model fits the image. This category of algorithms usually has two main stages: defining the model and fitting the model to image observations. In the first section, the reader is introduced to the different kinematic parametrizations of human motion. In the second section, the most commonly used representations of the human shape are described. The third section is dedicated to the description of different error functions proposed in the literature and to common optimization techniques used for human pose estimation. Specifically, local optimization and particle-based optimization and filtering are discussed and compared. The chapter concludes with a discussion of the state of the art in model-based pose estimation, current limitations and future directions.

G. Pons-Moll · B. Rosenhahn, Leibniz University, Hanover, Germany. In: T.B. Moeslund et al. (eds.), Visual Analysis of Humans, DOI 10.1007/978-0-85729-997-0_9, © Springer-Verlag London Limited 2011.

9.1 Kinematic Parametrization

In this chapter our main concern is estimating the human pose from images. Human motion is mostly articulated, i.e., it can be accurately modeled by a set of connected rigid segments. A segment is a set of points that move rigidly together. To determine the pose, we must first find an appropriate parametrization of the human motion. For the task of estimating human motion, a good parametrization must have the following attributes.

Attributes of a good parametrization for human motion:

  • Pose configurations are represented with the minimum number of parameters.
  • Human motion constraints, such as articulated motion, are naturally described.
  • Singularities can be avoided during optimization.
  • Derivatives of segment positions and orientations w.r.t. the parameters are easy to compute.
  • The rules for concatenating motions are simple.

A commonly used parametrization that meets most of the above requirements is a kinematic chain, which encodes the motion of a body segment as the motion of the previous segment in the chain and an angular motion about a body joint. For example, the motion of the lower arm is parametrized as the motion of the upper arm and a rotation about the elbow. The motion of a body segment relative to the previous one is parametrized by a rotation. Parametrizing rotations can be tricky because rotations form a non-Euclidean group: traveling any integer number of loops around an axis in space brings us back to the same point. We now briefly review the different parametrizations of rotations that have been used for human tracking.

9.1.1 Rotation Matrices

A rotation matrix $R \in \mathbb{R}^{3 \times 3}$ is an element of $SO(3)$, the group of $3 \times 3$ orthonormal matrices with $\det(R) = 1$ that represent rotations [34]. A rotation matrix encodes the orientation of a frame $B$, which we call the body frame, relative to a second frame $S$, which we call the spatial frame. Given a point $p$ with body coordinates $p_b = (\lambda_x, \lambda_y, \lambda_z)$, we can write the point $p$ in spatial coordinates as

\[ p_s = \lambda_x x^B_s + \lambda_y y^B_s + \lambda_z z^B_s, \tag{9.1} \]

where $x^B_s$, $y^B_s$, $z^B_s$ are the principal axes of the body frame $B$ written in spatial coordinates. We may also write the relationship between the spatial and body frame coordinates in matrix form as $p_s = R_{sb} p_b$. From this it follows that the rotation matrix is given by

\[ R_{sb} = \begin{bmatrix} x^B_s & y^B_s & z^B_s \end{bmatrix}. \tag{9.2} \]

Now consider a frame $B$ whose origin is translated w.r.t. frame $S$ by $t_s$ (the translation vector written in spatial coordinates). In this case, the coordinates of frames $S$ and $B$ are related by a rotation and a translation, $p_s = R_{sb} p_b + t_s$. Hence, a pair $(R \in SO(3),\ t \in \mathbb{R}^3)$ determines the configuration of a frame $B$ relative to another frame $S$. The set of such pairs is the product space of $\mathbb{R}^3$ with $SO(3)$, denoted $SE(3) = \mathbb{R}^3 \times SO(3)$. Elements of $SE(3)$ are $g = \{R, t\}$.

Fig. 9.1 Left: rigid body motion seen as a coordinate transformation. Right: rigid body motion seen as a relative motion in time.

Equivalently, writing the point in homogeneous coordinates $\bar{p}_b = \begin{bmatrix} p_b \\ 1 \end{bmatrix}$ allows us to use the more compact notation $\bar{p}_s = G_{sb} \bar{p}_b$, where

\[ G_{sb} = \begin{bmatrix} R_{sb\,[3 \times 3]} & t_{s\,[3 \times 1]} \\ 0_{[1 \times 3]} & 1 \end{bmatrix} \]
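
The relations in this subsection can be checked numerically. The following is a minimal sketch in plain Python, not code from the chapter: the function names (rotation_from_axes, is_rotation, homogeneous, transform_point) and the example frame are assumptions chosen for illustration. It stacks the body-frame axes as the columns of the rotation matrix as in (9.2), verifies the SO(3) conditions, and maps a point from body to spatial coordinates through the homogeneous matrix.

```python
def rotation_from_axes(x_axis, y_axis, z_axis):
    """Stack the body-frame axes (given in spatial coordinates) as columns of R_sb."""
    return [[x_axis[i], y_axis[i], z_axis[i]] for i in range(3)]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def is_rotation(R, tol=1e-9):
    """Return True if R^T R = I and det(R) = +1, i.e. R lies in SO(3) up to tol."""
    for i in range(3):
        for j in range(3):
            dot = sum(R[k][i] * R[k][j] for k in range(3))
            expected = 1.0 if i == j else 0.0
            if abs(dot - expected) > tol:
                return False
    return abs(det3(R) - 1.0) <= tol


def homogeneous(R, t):
    """Build the 4x4 matrix G_sb = [[R_sb, t_s], [0, 1]]."""
    return [list(R[i]) + [t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def transform_point(G, p_body):
    """Map body coordinates to spatial coordinates using homogeneous coordinates."""
    p_bar = list(p_body) + [1.0]
    p_out = [sum(G[i][j] * p_bar[j] for j in range(4)) for i in range(4)]
    return p_out[:3]


# Illustrative frame: B is rotated 90 degrees about the z-axis of S
# and its origin is translated by t_s = (1, 2, 3).
R_sb = rotation_from_axes((0.0, 1.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 0.0, 1.0))
G_sb = homogeneous(R_sb, (1.0, 2.0, 3.0))
print(is_rotation(R_sb))                         # True
print(transform_point(G_sb, (1.0, 0.0, 0.0)))    # [1.0, 3.0, 3.0]
```

In the example, the body x-axis points along the spatial y-axis, so the body point (1, 0, 0) lands at (0, 1, 0) before translation and at (1, 3, 3) after it, which matches $p_s = R_{sb} p_b + t_s$.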





Publication date: 2011